SYSTERS, GeneNest, SpliceNest: exploring sequence space from genome to protein
Authors
Abstract
We have integrated the protein families from SYSTERS and the expressed sequence tag (EST) clusters from our database GeneNest with SpliceNest, a new database mapping EST contigs into genomic DNA. The SYSTERS protein sequence cluster set provides an automatically generated classification of all sequences of the SWISS-PROT, TrEMBL and PIR databases into disjoint protein family and superfamily clusters. GeneNest is a database and software package for producing and visualizing gene indices from ESTs and mRNAs. Currently, the database comprises gene indices of human, mouse, Arabidopsis thaliana and zebrafish. SpliceNest is a web-based graphical tool to explore gene structure, including alternative splicing, based on a mapping of the EST consensus sequences from GeneNest to the complete human genome. The integration of SYSTERS, GeneNest and SpliceNest into one framework now permits an overall exploration of the whole sequence space covering protein, mRNA and EST sequences, as well as genomic DNA. The databases are available for querying and browsing at http://cmb.molgen.mpg.de.
Related articles
The SYSTERS Protein Family Database in 2005
The SYSTERS project aims to provide a meaningful partitioning of the whole protein sequence space by a fully automatic procedure. A refined two-step algorithm assigns each protein to a family and a superfamily. The sequence data underlying SYSTERS release 4 now comprise several protein sequence databases derived from completely sequenced genomes (ENSEMBL, TAIR, SGD and GeneDB), in addition to t...
The SYSTERS Protein Family Web Server: Shortcut from large-scale sequence information to phylogenetic information SYSTERS superfamily 114462 comprises most of the Cation efflux domain proteins in Arabidopsis thaliana
With this poster [11], we present the SYSTERS protein family database, an attempt to classify all available protein sequences. In particular, we focus on the capability of the web interface to assist in in-depth analyses of special protein families. We demonstrate this by an analysis of a specific family of transmembrane metal ion transport proteins characterised by the so-called cation effl...
Genome-wide identification and classification of alternative splicing based on EST data
MOTIVATION: Alternative splicing is currently seen to explain the vast disparity between the number of predicted genes in the human genome and the highly diverse proteome. The mapping of expressed sequence tag (EST) consensus sequences derived from the GeneNest database onto the genome provides an efficient way of predicting exon-intron boundaries, gene structure and alternative splicing events...
T-STAG: resource and web-interface for tissue-specific transcripts and genes
T-STAG (tissue-specific transcripts and genes) is a resource and web-interface designed to analyze tissue/tumor-specific expression patterns in human and mouse transcriptomes. It integrates our refined prediction of specific expression patterns, both in genes and in individual isoforms, with human-mouse orthology data. In combination with the features for combining/contrasting the genes e...
WWW access to the SYSTERS protein sequence cluster set
SUMMARY: We present a Web server where the SYSTERS cluster set of the non-redundant protein database consisting of sequences from SWISS-PROT and PIR is made available for querying and browsing. The cluster set can be searched with a new sequence using the SSMAL search tool. Additionally, a multiple alignment is generated for each cluster and annotated with domain information from the Pfam ...
Journal title:
- Nucleic Acids Research
Volume 30, Issue 1
Pages -
Publication date 2002